Structural insights into thrombolytic activity of destabilase from medicinal leech

Destabilase from the medical leech Hirudo medicinalis belongs to the family of i-type lysozymes. It has two different enzymatic activities: microbial cell walls destruction (muramidase activity), and dissolution of the stabilized fibrin (isopeptidase activity). Both activities are known to be inhibited by sodium chloride at near physiological concentrations, but the structural basis remains unknown. Here we present two crystal structures of destabilase, including a 1.1 Å-resolution structure in complex with sodium ion. Our structures reveal the location of sodium ion between Glu34/Asp46 residues, which were previously recognized as a glycosidase active site. While sodium coordination with these amino acids may explain inhibition of the muramidase activity, its influence on previously suggested Ser49/Lys58 isopeptidase activity dyad is unclear. We revise the Ser49/Lys58 hypothesis and compare sequences of i-type lysozymes with confirmed destabilase activity. We suggest that the general base for the isopeptidase activity is His112 rather than Lys58. pKa calculations of these amino acids, assessed through the 1 μs molecular dynamics simulation, confirm the hypothesis. Our findings highlight the ambiguity of destabilase catalytic residues identification and build foundations for further research of structure–activity relationship of isopeptidase activity as well as structure-based protein design for potential anticoagulant drug development.

Hirudotherapy, or medicinal leech therapy, is an old technique used for medicinal purposes for millennia and continues to be used today in modern hospitals for various diseases such as inflammatory and cardiovascular diseases, thrombosis, and after different surgeries 1 . For this reason, the European medicinal leech, Hirudo medicinalis, is currently subjected to intense genome mining 2,3 as a potential source of bioactive molecules that can be developed as drug leads.
The main therapeutic effects of hirudotherapy come from extremely potent anticoagulation factors secreted by the leech into the wound during feeding. While other known factors prevent thrombi formation, destabilase is the only one has a fibrinolytic effect 4 .
Isopeptidase activity of destabilase manifests itself in the form of a specific hydrolysis of ε-(γ-Glu)-Lys isopeptide bonds (bonds between side-chains of Glu and Lys) in the absence of peptidase function for α-(α-Glu)-Lys Scientific Reports | (2023) 13:6641 | https://doi.org/10.1038/s41598-023-32459-x www.nature.com/scientificreports/ peptide bonds 9 . This enzymatic activity of destabilase is usually detected by cleavage of the ε-(γ-Glu)-Lys dipeptide or its synthetic analog L-γ-glutamine-p-nitroanilide (L-γ-Glu-pNA) 10,11 . Such ε-(γ-Glu)-Lys isopeptide bonds are formed by specific transglutaminases and play an important role in many vital processes 12 , such as blood coagulation mediated by Factor(XIIIa). In this process the covalently stabilized, insoluble fibrin clot is generated by cross-linking of γ-and α-chains of separate fibrin molecules 13 . Due to its isopeptidase activity destabilase dissolutes stabilized fibrin and cleaves the bond formed between side chains of lysine and glutamine residues in the D-dimer (the final product of fibrinolysis) 9 . Predictably, destabilase greatly increases the thrombolysis of both venous and arterial thrombi in rat model 4 and destroys old, preliminarily formed thrombi in in vitro and in vivo experiments 11,14 . This ability makes destabilase a promising thrombolytic drug candidate with reduced side effects and increased effectivity 4 . Despite evident importance, the structure of destabilase remains unrevealed until now. Destabilase remained the only known i-type lysozyme with isopeptidase activity until a number of similar lysozymes were found in other invertebrates (Table 1). Among them are enzymes isolated from the annelid worm (Eisenia andrei) 15 , the sea cucumber (Apostichopus japonicus) 16 , the purple sea urchin (Strongylocentrotus purpuratus) 17 , the sea snail (Haliotis discus discus) 18 and the saltwater clam (Tapes japonica) 19 .
Isopeptidase activity of destabilase as well as of Tapes japonica lysozyme (TjL) is inhibited by the irreversible inhibitors of serine proteases, PMSF (phenylmethylsulfonyl fluoride) or AEBSF (4-(2-Aminoethyl)benzenesulfonyl fluoride), suggesting that Ser is involved in isopeptidase activity 19,20 and thus its active site is similar to those of serine proteases. Previously, basing on weak sequence homology with serine proteases, Ser49 or Ser51 of destabilase were proposed to be involved in isopeptidase activity 20 . Other engaged amino acids were assumed to be Lys58, Lys59, Ser82 and His112, with the last two being conserved in homologous TjL with confirmed isopeptidase activity 19 .
Further 3D homology modeling followed by site-directed mutagenesis revealed that functional centers of isopeptidase and muramidase activities are located close to each other with Ser49/Lys58 responsible for isopeptidase and Glu34/Asp46 for muramidase activity 21 . However, the molecular mechanism for these activities remains unknown.
In the present study, we further explore structural bases of destabilase activities. We report the first highresolution crystallographic structures of destabilase from Hirudo medicinalis, obtained at low sodium concentration at pH 8.0 and high sodium concentration at pH 5.0 with a resolution of 1.4 Å and 1.1 Å, correspondingly. We further discuss the role of the sodium ion as a muramidase and isopeptidase activity inhibitor. Finally, based on phylogenetic analysis supported by molecular dynamics calculations we propose that His112 rather than Lys58 may serve as a general base of the isopeptidase catalytic center.

Results and discussion
Phylogenetic analysis. To get evolutionary insight into relationships between the various i-type lysozymes, we have constructed a phylogenetic tree using the Pfam database 22 (Fig. 1) and plotted on the tree available data about muramidase and isopeptidase activities (Table 1). Lysozymes with isopeptidase activity belong to two distinct branches of the phylogenetic tree: arthropods' and non-arthropods' branches. www.nature.com/scientificreports/ Lysozymes with isopeptidase activity are expected to have a conserved Ser in the active site. The only conserved serine residue in proteins from both branches is Ser82 (here and below we refer to immature destabilase amino acid residues numbering, UniProt ID: Q25091) (Fig. 2). This serine residue was historically the first candidate suspected of belonging to the isopeptidase active site 20 . However, as we show below, Ser82 is buried deep in the protein core and has no access to any protein cavities, which contradicts its engagement in the isopeptidase activity.
The non-arthropods' group is the larger among the two and contains 6 lysozymes with confirmed isopeptidase activity including destabilase. They have conserved Ser49 and/or Ser51 with the sporadic exception of Haliotic discus discus (Fig. 2). These serine residues were previously suggested as the main candidates for catalytic serines of isopeptidase activity 21 . Notably, many other lysozymes from the non-arthropod branch with untested isopeptidase activity have conserved Ser49 and/or Ser51. Those are primary candidates to be examined as serine isopeptidases in future studies.
The arthropods' group is smaller and contains lysozymes with confirmed isopeptidase activity from Procambarus clarkii (Pclysi2) and Scylla paramamosain. These proteins have neither conserved Ser49 nor Ser51. Therefore, non-arthropods and arthropods probably obtained their isopeptidase activity independently during the evolution. In addition, the isopeptidase activity of arthropods may be of non serine protease origin.

Figure 1.
Phylogenetic analysis of Pfam "destabilase" protein family, annotated with information about muramidase and isopeptidase activities and source organisms. An outgroup protein (c-type lysozyme from hen egg Gallus gallus, colored with orange) is added as a distant relative of i-type lysozymes, indicating the root of the phylogenetic tree. A color of the first half of the organism taxonomy name corresponds to the muramidase activity and the second half-to the isopeptidase activity. Green and red colors mean presence and absence of the corresponded activity respectively, black means no information about the corresponding activity. Violet shows arthropods branch of the tree, yellow-the non-arthropods branch. Further on, we refer to these structures as high and low salt structures, correspondingly. Both destabilase structures share overall structure with the published homologous TjL (PDB ID: 2DQA 23 ) and Meretrix lusoria lysozyme (MlL, PDB ID: 4PJ2 24 ): C α -C α RMSD between all four structures is not exceeding 2 Å (Fig. 3).
The DALI program 25 revealed that in addition to mentioned lysozymes a number of lytic transglycosylases (LTs) (PDB IDs: 7LAM 26 , 4C5F 27 , 4OWD, 6FPN 28 , 1QTE 29 ) share high structural similarity with destabilase. Similar to lysozymes, LTs have substrate specificity for the glycosidic linkage of peptidoglycans. While lysozymes act as hydrolytic catalysts, LTs facilitate non-hydrolytic scission required for cell wall transformations in host bacteria 30 .
One of the unique features of destabilase is the high number of cysteine residues. While in average cysteines accounts for 1.38% of all amino acid residues in proteins 31 , destabilase has 14 cysteines per 115 total residues. All of them form disulfide bonds and are arranged in a close agreement with other i-type lysosomes TjL and MlL.
The asymmetric unit contains two protein molecules in essentially the same conformation (C α -C α RMSD equals 0.63 Å) in the case of high salt conditions, while one protein molecule is present per asymmetric unit in low salt structure (Fig. 4).
Structural basis for inhibition of muramidase activity by sodium ions. The C α -C α RMSD between the low salt and both chains of high salt structures is less than 0.8 Å. The main difference between low and high salt structures as well as between the two chains of high salt structure is tied to a flexible loop (Gly44-Ser51), containing enzymatically important Asp46 (Fig. 4a). Together with Glu34, Asp46 forms a catalytic dyad for muramidase activity in i-type lysozymes 23 . In the low salt structure, the Gly44-Ser51 loop is bent outward with Asp46 exposed to the bulk solution (Fig. 4b). Val47 backbone forms a H-bond with an adjacent protein molecule, and therefore crystal packing contributes to stabilization of "open" conformation of the Gly44-Ser51 loop. In the high salt structure, both protein chains possess a more compact Gly44-Ser51 loop adherent to the protein core (Fig. 4c,d). Asp46 has two alternative conformations in both chains and in all cases is oriented towards Glu34.
Both isopeptidase and muramidase activities are inhibited by submolar sodium chloride concentrations 32 . In our high salt structure, both A and B chains contain one sodium ion with 50% occupancy situated in the cleft between Glu34 and Asp46 and stabilized in octahedral coordination with water molecules by Glu34 and Asp46 side chains (Fig. 3b,c). Our low salt structure lacks the sodium ion but contains cryoprotectant glycerol molecule, H-bonded with Glu34 but not Asp46 (Fig. 4b-d).
Notably, the same binding cavity is occupied by the muramidase product inhibitor (Tri(N-acetyl-D-glucosamine) (NAG3)) and proteinaceous lysozyme inhibitor in TjL (PDB ID: 2DQA 23 ) and MlL (PDB ID: 4PJ2 24 ) structures, respectively ( Fig. 3d-g). Both structures have two lysozyme chains, which differ in the conformation of homologous Gly44-Ser51 loop. The inhibitor-bound MlL has more open loop conformations, while inhibitorbound TjL has more closed ones. Such variability points out to the importance of the loop conformational flexibility for capturing the substrate in i-type lysozymes. The inhibitor (whether it is sodium ion in destabilase or NAG3 in TjL) fixes the loop in the closed conformation; this explains the inhibition mechanism.
Remarkably, in addition to its high sodium content, the crystallization sample of the high salt destabilase structure also included a muramidase product inhibitor NAG3. However, electron density maps did not reveal the presence of NAG3 molecule, suggesting that the inhibitory effect of sodium ions was at play. Interestingly, both sodium ions and NAG3 were present in the crystallization conditions of TjL (PDB ID: 2DQA 23 ) and MlL lysozyme (PDB ID: 3AB6 33 ). In contrast to our study, these structures contained NAG3 molecules but no sodium  Table 1. Color of the first half of the organism taxonomy name corresponds to the muramidase activity and the second half-to the isopeptidase activity. Green and red colors mean presence and absence of the confirmed corresponding activity, black means no information is available related to corresponding activity. Amino acids discussed in the manuscript are marked with light blue.  www.nature.com/scientificreports/ ions. This discrepancy may be due to lower sodium concentrations in these structures, which were not sufficient to produce an inhibitory effect under the crystal conditions.
Determination of catalytic residues of isopeptidase activity in destabilase. Serine proteases are the best-known class of proteases 34,35 , which classically contain Ser/His/Asp catalytic triad. Here serine is the nucleophile, histidine is the general base, and aspartate is the acid, helping to orient the histidine residue and neutralize the charge that develops on the histidine during the transition states. However, there are a number of atypical serine proteases which use alternative general base (such as Glu) and acid (Glu or His) or have catalytic dyad instead of triad with Lys or His as a general base 34,36 . In some cases, an individual Ser may serve as the only catalytic residue 34 .
As we have already mentioned in the phylogenetic analysis section, there are three Ser residues, conserved in the non-arthropod branch: Ser49, Ser51 and Ser82. The latter one is buried in the protein, while Ser49/51 are located in the protein cleft and, therefore, are the main candidates for the nucleophile of isopeptidase activity. Given that the sodium ion appears near Ser51 in our high salt structure, we propose this serine to be the nucleophile in destabilase.
General base residue accepts and releases a proton during protease catalytic cycle. Therefore, it is expected to have pK a close to the pH optimum of the enzyme activity, which is between 5 and 7 for destabilase 34 . As we mentioned above, His, Glu and Lys may serve as a general base. There are 4 amino acids of these types in the vicinity of Ser49/51: Glu34, Lys58, Lys59, His112. Additionally, Asp46 may also be a candidate due to its chemical similarity to Glu.
To determine the general base residue among the 5 possible candidates, we performed µs-long unbiased molecular dynamics simulations and estimated pK a of these residues at each frame of the simulation (Fig. 5). Both Lys58 and Lys59 have pK a 's above 10.0 in all accessible conformations. These residues are solvent-exposed and thus their pK a are close to the pK a of a free lysine in solution. pK a 's of carboxylic residues Asp46 and Glu34 are both below 5. In contrast, our molecular dynamics simulations predict the pK a of His112 to be around 6.4, which is within the activity range of the enzyme. www.nature.com/scientificreports/ Thus, we showed that Ser51 is the most probable nucleophile, while His112 is the general base. The only possible candidate for the acid is Glu34 situated in the immediate vicinity of the His112. Thus, we presume the isopeptidase active site of destabilase to have an architecture similar to the Ser-His-Glu triad of serine proteases.
Previously, Zavalova et al. suggested that Ser49-Lys58 dyad, but not Ser51-His112-Glu34 triad, realizes isopeptidase activity 21 . Their conclusions are based on the 3D homology model of destabilase and mutagenesis analysis. This is not confirmed by our results. Destabilase is expressed in soluble form in human Expi293F cell line in our study, while it was refolded from E.coli inclusion bodies by Zavalova et al. 21 . The direct comparison showed (see Fig. 6 in Ref. 37 ) that the latter approach results in a protein with extremely low isopeptidase activity and with circular dichroism (CD) spectra of denatured protein, while the protein from Expi293F shows high activity and reasonable CD spectra. Therefore, we suggest that the contradiction originates from low functional expression in E.coli and mutagenesis experiments should be reproduced with the protein obtained from Expi293F cell line.
The functional triads are highly stabilized by mutual hydrogen bonds in the large majority of serine protease structures with very strong low-barrier hydrogen bonds observed in several cases 38 . Surprisingly, all three amino acid residues in the isopeptidase site of destabilase do not form hydrogen bonds to each other in both high and low salt structures (Fig. 4). We expect it is because both our structures were substrate-free. Given that Ser51 residue is located in the flexible loop, in a substrate-bound state it can approach His112/Glu34, His112 side chain can flip and all necessary hydrogen bonds can be formed.

Conclusions
In the present work we performed phylogenetic analysis of i-type lysozymes from the Pfam database and showed that isopeptidase activity emerged independently in arthropod and non-arthropod branches. Since inhibitors of serine proteases can also inhibit isopeptidase activity of the lysozymes from the non-arthropod branch 19,20 , we propose three conserved Ser residues as candidates for nucleophile of the catalytic active site.
We obtained the first high-resolution X-ray structures of the destabilase in high and low salt conditions at 1.1 and 1.4 Å respectively. These structures provide structural basis for the inhibition of muramidase and isopeptidase activities: a sodium ion occupies the cleft between catalytic dyad Glu34/Asp46 and functionally flexible loop Gly44-Ser51 prevents the substrate from approaching the active site.
New structural information presented here also shows Ser51 to be the most probable nucleophile for isopeptidase activity. Further molecular dynamics simulation with the new structure demonstrates that His112 could serve as the general base instead of the previously proposed Lys58. Our results suggest that together with the adjacent Glu34 and His112, Ser51 could form a Ser-His-Glu active triad for the isopeptidase activity.
Thus, the present study provides structural basis and mechanistic insights into the isopeptidase activity of destabilase. Our findings lay a foundation for further research of the structure-activity relationship of isopeptidase activity as well as structure-based protein design for potential anticoagulant drug development.

Materials and methods
Phylogenetic analysis. Sequence set for analysis was collected from the family PF05497 of Pfam database 22 , release 33.1 at the moment of retrieval (December 2020). Alignment of 936 sequences' fragments from PFAM was used as a base alignment. Bacterial and viral metagenomic sequences together with several other sequences were grouped in the same branch. This branch was excluded from the analysis since it had low support values and did not contain sequences meaningful for the analysis. Two additional sequences were added manually: P00698 (Gallus gallus) as an outgroup representative and Pclysi2 (Procambarus clarkii) as a sequence not present in the Pfam dataset but previously described in literature 39 .
MAFFT v7.475 40 was used to align additional sequences to the Pfam alignment with-add option and L-INS-i algorithm. For phylogenetic tree estimation, TrimAl 41 1.2 was used to remove columns in alignment with gap content more than 90%. Phylogenetic trees were generated using RAxML 42 version 8.2.12 using the Crystallographic data processing and structure refinement. All the data were processed in the XDS software package 45 . The phase problem was solved by molecular replacement in phenix.phaser 46,47 , where the generated poly-ala model of TjL (PDB ID 2DQA 23 ) was used as a starting model. Crystals at high salt conditions have a P1 space group with two molecules per asymmetric unit, while crystals at low salt conditions have a P2 1 space group with one molecule per asymmetric unit. The models were subsequently rebuilt in phenix. autobuild 48 , phenix.refine 46 and COOT 49 . TLS refinement was used in low salt destabilase structure. Final resolution cut-off was determined by paired refinement 50 . The quality of the resulting models was analyzed by phenix. molprobity 51 and Quality Control Check web server 52 . Crystallographic data collection and refinement statistics are given in Supplementary Table 1. The figures were generated using PyMOL (The PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC) 53 .
Three sodium ions were identified in the high salt destabilase structure (atoms A:501, A:502 and B:501). While distinguishing sodium ions from water molecules is challenging in many cases due to similar coordination and number of electrons, it is unambiguous in this case. The atoms are six-coordinated with octahedral geometry and bonds in the range 2.3-2.5 Å. This geometry is typical for sodium ions but not for water molecules. CheckMyMetal validation tool 54 confirms correct ions placement. The overall valencies are close to the unit of valence for these atoms that confirms that sodium ions are more chemically sensible in these positions than water molecules.

Molecular dynamics simulations and pKa calculations.
Chain A of the high salt structure was preprocessed to remove alternative conformations and model missing side chains using the energy-based optimization protocols available in ICM-Pro (v.3.8-7) 55 . Starting pyroglutamic acid and ending His-tag were removed. The preprocessed structures were protonated using GROMACS (v. 2020.1) 56 . Since the true protonation state of His112 at physiological pH is unknown, two different protonation states for His112 at ND1-atom were modeled in two separate 1 µs runs. Molecular dynamics simulation was performed using the OPLS-AA force field with SPC/E model of water. The structure was simulated in a periodic cubic box containing 7972/7983 water molecules and 10/11 chloride ions for uncharged/charged run, respectively. After the initial energy minimization, the system was consequently equilibrated in 100-ps NVT and 100-ps NVT phases with position restraining force on the heavy atoms of the protein, followed by production runs for 1 µs with 2 fs timestep using modified Berendsen thermostat and Parrinello-Rahman barostat for both protonation states. Backbone Root Mean Square Deviation (RMSD) of the after least square fit to the starting model backbone for the production runs are shown on the Supplementary Fig. 2. The simulations were performed using GROMACS (v2020.1) simulation package on a working station with a single Nvidia GTX 1050 GPU. pKa calculations were performed using PROPKA 3.2 within propkatraj package 57,58 and plotted using seaborn package 59